fix(server): different graph names share the same backend #3027
Conversation
imbajin
left a comment
The direction of the fix is correct: prefixing the server ID with the graph namespace resolves the cross-graph key collision. However, there is one blocking logic defect plus missing test coverage; I suggest fixing both before merging.
The PR checkbox claims existing tests cover this, but a full search of hugegraph-server/hugegraph-test/ turns up no test for the "two graphs sharing one backend with the same nodeId" scenario.
Please add a regression test that at least verifies:
- When ~sys_graph and hugegraph are initialized with the same nodeId (e.g. "server-1"), their selfNodeId() return values differ
- Both graphs can start concurrently without throwing a parse error or an "already in cluster" exception
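The first check above can be sketched as a minimal, self-contained test. Here `namespacedId` is a hypothetical stand-in for the selfNodeId() composition logic, not the real ServerInfoManager API; a real regression test would start both graphs against a shared backend.

```java
// Sketch of the suggested regression check: two graphs, same bare nodeId,
// must yield distinct namespaced ids.
public class NamespacedIdTest {

    // Hypothetical helper mirroring the PR's "{graphname}/{nodeId}" scheme
    static String namespacedId(String spaceGraphName, String nodeId) {
        return spaceGraphName + "/" + nodeId;
    }

    public static void main(String[] args) {
        String sysId = namespacedId("DEFAULT-~sys_graph", "server-1");
        String defId = namespacedId("DEFAULT-hugegraph", "server-1");
        if (sysId.equals(defId)) {
            throw new IllegalStateException("ids must not collide");
        }
        System.out.println(sysId);
        System.out.println(defId);
    }
}
```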
In existing deployments, ServerInfo vertex keys in the backend use the format server-1. After the upgrade, the code queries with DEFAULT-hugegraph/server-1, cannot find the old record, and writes a fresh one; the old server-1 record is left behind in the backend as an uncleaned orphan. The functional impact is limited, but please call this out in the PR description or release notes to avoid confusing operators.
🧹 Minor suggestion: spaceGraphName character set
The value of spaceGraphName() is concatenated from user configuration and is never validated against the / character. If a user configures graphSpace as a string containing /, the generated ID will contain multiple / separators, which can make later parsing ambiguous. Consider adding an assertion, or documenting that graphSpace must not contain /.
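A possible shape for that guard, as a self-contained sketch (plain strings and IllegalArgumentException here; the real code would likely use E.checkArgument on the config value):

```java
public class GraphSpaceNameCheck {

    // Reject '/' so that "<graphSpace>/<nodeId>" ids stay unambiguous
    static void checkGraphSpace(String graphSpace) {
        if (graphSpace.contains("/")) {
            throw new IllegalArgumentException(
                    "graphSpace must not contain '/': " + graphSpace);
        }
    }

    public static void main(String[] args) {
        checkGraphSpace("DEFAULT");           // valid, passes silently
        try {
            checkGraphSpace("DEFAULT/extra"); // invalid, rejected
            System.out.println("unexpected: no exception");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```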
```java
// Scope server id to graph to avoid cross-graph collision
return IdGenerator.of(this.graph.spaceGraphName() + "/" +
                      this.globalNodeInfo.nodeId().asString());
}
```
Mismatched read/write IDs in initServerInfo() (blocking issue)
With this change, selfNodeId() returns a namespaced ID (e.g. DEFAULT-hugegraph/server-1), but the existence check in the caller initServerInfo() still uses the raw bare ID, so the two paths use different keys:
```java
// initServerInfo() (untouched by this PR)
Id serverId = nodeInfo.nodeId();                    // ❌ bare ID: "server-1"
HugeServerInfo existed = this.serverInfo(serverId); // ❌ queried with "server-1"
// ...
this.globalNodeInfo = nodeInfo;   // line 140; selfNodeId() changes only after this
this.saveServerInfo(this.selfNodeId(), ...);        // ✅ saves "DEFAULT-hugegraph/server-1"
```
Consequences:
- The alive check at lines 109–122 (the guard against duplicate startup of a same-named node) can never find the already-namespaced records, so the guard is silently skipped
- On restart, the old namespaced record is not recognized; a new record is written directly, and the old one accumulates as orphan data in the backend
Suggested fix: assign globalNodeInfo at the top of initServerInfo(), then use selfNodeId() for every subsequent query so the read and write keys stay consistent:
```java
public synchronized void initServerInfo(GlobalMasterInfo nodeInfo) {
    E.checkArgument(nodeInfo != null, "The global node info can't be null");
    this.globalNodeInfo = nodeInfo;  // Assign first: selfNodeId() now returns the namespaced ID
    Id serverId = this.selfNodeId(); // ✅ Same key as the save/remove paths
    HugeServerInfo existed = this.serverInfo(serverId);
    if (existed != null && existed.alive()) {
        // ... alive check & wait logic unchanged ...
    }
    E.checkArgument(existed == null || !existed.alive(),
                    "The server with name '%s' already in cluster", serverId);
    // master uniqueness check unchanged ...
    this.saveServerInfo(serverId, this.selfNodeRole());
}
```
This way reads and writes always use the same ID, and the alive check and master-uniqueness guard actually take effect.
Codecov Report
✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##           master    #3027      +/-  ##
=============================================
+ Coverage    35.85%   93.25%   +57.40%
+ Complexity     338       65      -273
=============================================
  Files          802        9      -793
  Lines        67995      267    -67728
  Branches      8902       22     -8880
=============================================
- Hits         24381      249    -24132
+ Misses       41008        8    -41000
+ Partials      2606       10     -2596
```
imbajin
left a comment
+1, the key collision fix now scopes the server id by graph name and the regression test covers the core case. I added the upgrade-impact note for old local-scheduler tasks; the remaining compatibility edge is limited to unfinished non-PD/local tasks that existed before upgrade, so this looks acceptable to merge from my side.
Purpose of the PR
During backend initialization, the server creates two graphs, ~sys_graph and hugegraph (the default), that share the same store backend. This leads to a key collision: both graphs register their server info under the same bare id.
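The collision can be illustrated with a plain map standing in for the shared backend (illustrative only; real storage goes through the backend store API, and the graph-space names here are examples):

```java
import java.util.HashMap;
import java.util.Map;

public class KeyCollisionDemo {
    public static void main(String[] args) {
        Map<String, String> backend = new HashMap<>();

        // Both graphs register under the same bare id: the second put
        // overwrites the first, so only one record survives
        backend.put("server-1", "info-of-~sys_graph");
        backend.put("server-1", "info-of-hugegraph");
        System.out.println("bare keys stored: " + backend.size());

        backend.clear();

        // Namespaced ids keep the two records separate
        backend.put("DEFAULT-~sys_graph/server-1", "info-of-~sys_graph");
        backend.put("DEFAULT-hugegraph/server-1", "info-of-hugegraph");
        System.out.println("namespaced keys stored: " + backend.size());
    }
}
```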
Main Changes
- ServerInfoManager.selfNodeId() now returns "{graphname}/server-1" instead of the previous "server-1"
Upgrade impact
This change namespaces the server id by graph name, so old unfinished tasks in the non-PD local scheduler may still reference the previous bare server id, for example server-1. Those historical tasks can remain visible, but they may not be restored or cancelled by the new namespaced server id after upgrade. The impact is limited to unfinished local-scheduler tasks that already existed before upgrading; newly scheduled tasks use the new namespaced id. To avoid this compatibility edge case, finish or cancel pending local tasks before upgrading.
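The compatibility edge boils down to a key mismatch at lookup time, sketched below (a map stands in for the task store, and the task name is a made-up example; the real path goes through the local task scheduler):

```java
import java.util.HashMap;
import java.util.Map;

public class UpgradeLookupDemo {
    public static void main(String[] args) {
        // Pre-upgrade state: an unfinished task bound to the bare server id
        Map<String, String> tasksByServer = new HashMap<>();
        tasksByServer.put("server-1", "unfinished-local-task");

        // Post-upgrade code looks up with the namespaced id and misses,
        // so the old task cannot be restored or cancelled
        String hit = tasksByServer.get("DEFAULT-hugegraph/server-1");
        System.out.println("post-upgrade lookup: " + hit);
    }
}
```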
Verifying these changes
Does this PR potentially affect the following parts?
Documentation Status
- Doc - TODO
- Doc - Done
- Doc - No Need